Learning Shortest Paths in Word Graphs
Authors
Abstract
In this paper we briefly sketch our work on text summarisation using compression graphs. The task is as follows: given a set of related sentences describing the same event, we aim to generate a single sentence that is simply structured, easily understandable, and minimal in the number of words/tokens. Traditionally, sentence compression is treated as finding the shortest path in a word graph in an unsupervised setting. The major drawback of this approach is its reliance on manually crafted heuristics for the edge weights. By contrast, we cast sentence compression as a structured prediction problem. Edges of the compression graph are represented by features drawn from adjacent nodes, so that the corresponding weights are learned by a generalised linear model. Decoding is performed in polynomial time by a generalised shortest path algorithm using loss-augmented inference. We report preliminary results on artificial and real-world data.
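The idea of scoring edges with a linear model and decoding by shortest path can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy word graph, the node frequencies, the three features in `edge_features`, and the fixed weight vector `W` (which the paper would learn rather than set by hand). Decoding here is plain Dijkstra, which assumes non-negative edge costs; it is not the generalised, loss-augmented algorithm the abstract refers to.

```python
import heapq

def edge_features(u, v, freq):
    # Hypothetical features drawn from the two adjacent nodes:
    # a bias term and the inverse frequency of each endpoint.
    return [1.0, 1.0 / freq[u], 1.0 / freq[v]]

def edge_cost(w, phi):
    # Linear model: the cost of an edge is the dot product w . phi.
    return sum(wi * xi for wi, xi in zip(w, phi))

def shortest_path(edges, w, freq, start="<s>", end="</s>"):
    """Dijkstra over a word graph; assumes all edge costs are non-negative."""
    graph = {}
    for u, v in edges:
        cost = edge_cost(w, edge_features(u, v, freq))
        graph.setdefault(u, []).append((v, cost))
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == end:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, c in graph.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if end not in dist:
        return None  # end node unreachable
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

# Toy graph merging "storm hits coast" and
# "storm severely hits northern coast" (illustrative data only).
EDGES = [
    ("<s>", "storm"), ("storm", "hits"), ("hits", "coast"), ("coast", "</s>"),
    ("storm", "severely"), ("severely", "hits"),
    ("hits", "northern"), ("northern", "coast"),
]
FREQ = {"<s>": 2, "storm": 2, "hits": 2, "coast": 2, "</s>": 2,
        "severely": 1, "northern": 1}
W = [1.0, 1.0, 1.0]  # hand-set here; learned from data in the structured setting

compression = shortest_path(EDGES, W, FREQ)
```

With these hand-set weights, words shared by both sentences are cheaper to traverse, so the decoded path keeps the common core of the two inputs. In a learned setting, the weight vector would be fitted so that reference compressions score better than competing paths.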
Similar resources
Multi-Sentence Compression: Finding Shortest Paths in Word Graphs
We consider the task of summarizing a cluster of related sentences with a short sentence, which we call multi-sentence compression, and present a simple approach based on shortest paths in word graphs. The advantage and the novelty of the proposed method is that it is syntax-lean and requires little more than a tokenizer and a tagger. Despite its simplicity, it is capable of generating grammatical...
Learning to Summarise Related Sentences
We cast multi-sentence compression as a structured prediction problem. Related sentences are represented by a word graph so that summaries constitute paths in the graph (Filippova, 2010). We devise a parameterised shortest path algorithm that can be written as a generalised linear model in a joint space of word graphs and compressions. We use a large-margin approach to adapt parameterised edge ...
Replacement Paths via Row Minima of Concise Matrices
Matrix M is k-concise if the finite entries of each column of M consist of k or fewer intervals of identical numbers. We give an O(n + m)-time algorithm to compute the row minima of any O(1)-concise n×m matrix. Our algorithm yields the first O(n+m)-time reductions from the replacement-paths problem on an n-node m-edge undirected graph (respectively, directed acyclic graph) to the single-source ...
Learning Shortest Paths for Word Graphs
The vast amount of information on the Web drives the need for aggregation and summarisation techniques. We study event extraction as a text summarisation task using redundant sentences which is also known as sentence compression. Given a set of sentences describing the same event, we aim at generating a summarisation that is (i) a single sentence, (ii) simply structured and easily understandabl...
The All-Paths and Cycles Graph Kernel
With the recent rise in the amount of structured data available, there has been considerable interest in methods for machine learning with graphs. Many of these approaches have been kernel methods, which focus on measuring the similarity between graphs. These generally involve measuring the similarity of structural elements such as walks or paths. Borgwardt and Kriegel [1] proposed the all-pa...